AITopics | data silo

data silo

Information about AI from the News, Publications, and Conferences

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Accelerating drug discovery with Artificial: a whole-lab orchestration and scheduling system for self-driving labs

Fehlis, Yao, Mandel, Paul, Crain, Charles, Liu, Betty, Fuller, David

arXiv.org Artificial Intelligence, Apr-1-2025

Yao Fehlis, Paul Mandel, Charles Crain, Betty Liu, David Fuller (Artificial Inc.). Abstract: Self-driving labs are transforming drug discovery by enabling automated, AI-guided experimentation, but they face challenges in orchestrating complex workflows, integrating diverse instruments and AI models, and managing data efficiently. Artificial addresses these issues with a comprehensive orchestration and scheduling system that unifies lab operations, automates workflows, and integrates AI-driven decision-making. By incorporating AI/ML models like NVIDIA BioNeMo, which facilitates molecular interaction prediction and biomolecular analysis, Artificial enhances drug discovery and accelerates data-driven research. Through real-time coordination of instruments, robots, and personnel, the platform streamlines experiments, enhances reproducibility, and advances drug discovery. Introduction: The landscape of drug discovery has long been characterized by a multitude of challenges, including the high costs of research and development, lengthy timelines, and a significant rate of failure during clinical trials (Blanco-Gonzalez et al., 2023; Udegbe et al., 2024; Khanna, 2012; Moffat et al., 2017).

artificial intelligence, machine learning, planning & scheduling, (16 more...)

arXiv.org Artificial Intelligence

2504.00986

Country: Asia > Japan > Kyūshū & Okinawa > Kyūshū > Fukuoka Prefecture > Fukuoka (0.04)

Genre: Research Report (1.00)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (0.90)
Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles (0.87)

Contrastive Federated Learning with Tabular Data Silos

Ginanjar, Achmad, Li, Xue, Hua, Wen

arXiv.org Artificial Intelligence, Sep-9-2024

Learning from data silos is a difficult task for organizations that need to obtain knowledge of objects that appear in multiple independent data silos. Objects in multi-organizations, such as government agents, are referred to by different identifiers, such as driver license, passport number, and tax file number. The data distributions in data silos are mostly non-IID (not Independently and Identically Distributed), labelless, and vertically partitioned (i.e., having different attributes). Privacy concerns harden the above issues. Conditions inhibit enthusiasm for collaborative work. While Federated Learning (FL) has been proposed to address these issues, the difficulty of labeling, namely, label costliness, often hinders optimal model performance. A potential solution lies in contrastive learning, an unsupervised self-learning technique to represent semantic data by contrasting similar data pairs. However, contrastive learning is currently not designed to handle tabular data silos that exist within multiple organizations where data linkage by quasi-identifiers is needed. To address these challenges, we propose using semi-supervised contrastive federated learning, which we refer to as Contrastive Federated Learning with Data Silos (CFL). Our approach tackles the aforementioned issues with an integrated solution. Our experimental results demonstrate that CFL outperforms current methods in addressing these challenges and providing improvements in accuracy. Additionally, we present positive results that showcase the advantages of our contrastive federated learning approach in complex client environments.

contrastive learning, learning, silo, (13 more...)

arXiv.org Artificial Intelligence

2409.06123

Country:

Oceania > Australia > Queensland (0.04)
Asia > China > Hong Kong (0.04)
Europe > Latvia > Lubāna Municipality > Lubāna (0.04)
Asia > Indonesia (0.04)

Genre:

Research Report > New Finding (0.89)
Research Report > Promising Solution (0.88)

Industry:

Information Technology > Security & Privacy (1.00)
Government (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

On Vessel Location Forecasting and the Effect of Federated Learning

Tritsarolis, Andreas, Pelekis, Nikos, Bereta, Konstantina, Zissis, Dimitris, Theodoridis, Yannis

arXiv.org Artificial Intelligence, May-30-2024

The wide spread of Automatic Identification System (AIS) has motivated several maritime analytics operations. Vessel Location Forecasting (VLF) is one of the most critical operations for maritime awareness. However, accurate VLF is a challenging problem due to the complexity and dynamic nature of maritime traffic conditions. Furthermore, as privacy concerns and restrictions have grown, training data has become increasingly fragmented, resulting in dispersed databases of several isolated data silos among different organizations, which in turn decreases the quality of learning models. In this paper, we propose an efficient VLF solution based on LSTM neural networks, in two variants, namely Nautilus and FedNautilus for the centralized and the federated learning approach, respectively. We also demonstrate the superiority of the centralized approach with respect to current state of the art and discuss the advantages and disadvantages of the federated against the centralized approach.

dataset, fednautilus, trajectory, (15 more...)

arXiv.org Artificial Intelligence

2405.1987

Country:

Europe > Greece > Attica > Athens (0.04)
Europe > Belgium > Brussels-Capital Region > Brussels (0.04)
Asia > China > Tianjin Province > Tianjin (0.04)

Genre: Research Report > New Finding (0.47)

Industry:

Transportation (1.00)
Information Technology > Security & Privacy (1.00)
Law (0.93)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Uncertainty-Based Extensible Codebook for Discrete Federated Learning in Heterogeneous Data Silos

Zhang, Tianyi, Cao, Yu, Liu, Dianbo

arXiv.org Artificial Intelligence, Mar-1-2024

Federated learning (FL), aimed at leveraging vast distributed datasets, confronts a crucial challenge: the heterogeneity of data across different silos. While previous studies have explored discrete representations to enhance model generalization across minor distributional shifts, these approaches often struggle to adapt to new data silos with significantly divergent distributions. In response, we have identified that models derived from FL exhibit markedly increased uncertainty when applied to data silos with unfamiliar distributions. Consequently, we propose an innovative yet straightforward iterative framework, termed Uncertainty-Based Extensible-Codebook Federated Learning (UEFL). This framework dynamically maps latent features to trainable discrete vectors, assesses the uncertainty, and specifically extends the discretization dictionary or codebook for silos exhibiting high uncertainty. Our approach aims to simultaneously enhance accuracy and reduce uncertainty by explicitly addressing the diversity of data distributions, all while maintaining minimal computational overhead in environments characterized by heterogeneous data silos. Through experiments conducted on five datasets, our method has demonstrated its superiority, achieving significant improvements in accuracy (by 3% to 22.1%) and uncertainty reduction (by 38.83% to 96.24%), thereby outperforming contemporary state-of-the-art methods. The source code is available at https://github.com/destiny301/uefl.

codebook, codeword, uncertainty-based extensible-codebook federated learning, (11 more...)

arXiv.org Artificial Intelligence

2402.18888

Country:

North America > United States > Minnesota (0.04)
North America > United States > Massachusetts (0.04)

Genre: Research Report (1.00)

Industry: Health & Medicine (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)

Adaptive Distributed Kernel Ridge Regression: A Feasible Distributed Learning Scheme for Data Silos

Wang, Di, Liu, Xiaotong, Lin, Shao-Bo, Zhou, Ding-Xuan

arXiv.org Machine Learning, Sep-8-2023

Data silos, mainly caused by privacy and interoperability, significantly constrain collaborations among different organizations with similar data for the same purpose. Distributed learning based on divide-and-conquer provides a promising way to settle the data silos, but it suffers from several challenges, including autonomy, privacy guarantees, and the necessity of collaborations. This paper focuses on developing an adaptive distributed kernel ridge regression (AdaDKRR) by taking autonomy in parameter selection, privacy in communicating non-sensitive information, and the necessity of collaborations in performance improvement into account. We provide both solid theoretical verification and comprehensive experiments for AdaDKRR to demonstrate its feasibility and effectiveness. Theoretically, we prove that under some mild conditions, AdaDKRR performs similarly to running the optimal learning algorithms on the whole data, verifying the necessity of collaborations and showing that no other distributed learning scheme can essentially beat AdaDKRR under the same conditions. Numerically, we test AdaDKRR on both toy simulations and two real-world applications to show that AdaDKRR is superior to other existing distributed learning schemes. All these results show that AdaDKRR is a feasible scheme to defend against data silos, which are highly desired in numerous application regions such as intelligent decision-making, pricing forecasting, and performance prediction for products.

artificial intelligence, local machine, machine learning, (15 more...)

arXiv.org Machine Learning

2309.04236

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > Middle East > Jordan (0.04)
Asia > China > Shaanxi Province > Xi'an (0.04)
(2 more...)

Genre: Research Report > New Finding (0.66)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.61)

The Role of Cross-Silo Federated Learning in Facilitating Data Sharing in the Agri-Food Sector

Durrant, Aiden, Markovic, Milan, Matthews, David, May, David, Enright, Jessica, Leontidis, Georgios

arXiv.org Artificial Intelligence, May-4-2023

Data sharing remains a major hindering factor when it comes to adopting emerging AI technologies in general, but particularly in the agri-food sector. Protectiveness of data is natural in this setting; data is a precious commodity for data owners, which if used properly can provide them with useful insights on operations and processes leading to a competitive advantage. Unfortunately, novel AI technologies often require large amounts of training data in order to perform well, something that in many scenarios is unrealistic. However, recent machine learning advances, e.g. federated learning and privacy-preserving technologies, can offer a solution to this issue by providing the infrastructure and underpinning technologies needed to use data from various sources to train models without ever sharing the raw data themselves. In this paper, we propose a technical solution based on federated learning that uses decentralized data (i.e. data that are not exchanged or shared but remain with the owners) to develop a cross-silo machine learning model that facilitates data sharing across supply chains. We focus our data sharing proposition on improving production optimization through soybean yield prediction, and provide potential use-cases in which such methods can assist in other problem settings. Our results demonstrate that our approach not only performs better than each of the models trained on an individual data source, but also that data sharing in the agri-food sector can be enabled via alternatives to data exchange, whilst also helping to adopt emerging machine learning technologies to boost productivity.

artificial intelligence, deep learning, machine learning, (16 more...)

arXiv.org Artificial Intelligence

doi: 10.1016/j.compag.2021.106648

2104.07468

Country:

North America > United States > South Dakota (0.04)
North America > United States > North Dakota (0.04)
North America > United States > Nebraska (0.04)
(12 more...)

Genre: Research Report > New Finding (0.68)

Industry:

Information Technology > Security & Privacy (1.00)
Government > Regional Government > North America Government > United States Government (1.00)
Food & Agriculture > Agriculture (1.00)
(3 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)

Federated Alternate Training (FAT): Leveraging Unannotated Data Silos in Federated Segmentation for Medical Imaging

Mushtaq, Erum, Bakman, Yavuz Faruk, Ding, Jie, Avestimehr, Salman

arXiv.org Artificial Intelligence, Apr-18-2023

Federated Learning (FL) aims to train a machine learning (ML) model in a distributed fashion to strengthen data privacy with limited data migration costs. It is a distributed learning framework naturally suitable for privacy-sensitive medical imaging datasets. However, most current FL-based medical imaging works assume silos have ground truth labels for training. In practice, label acquisition in the medical field is challenging as it often requires extensive labor and time costs. To address this challenge and leverage the unannotated data silos to improve modeling, we propose an alternate training-based framework, Federated Alternate Training (FAT), that alternates training between annotated data silos and unannotated data silos. Annotated data silos exploit annotations to learn a reasonable global segmentation model. Meanwhile, unannotated data silos use the global segmentation model as a target model to generate pseudo labels for self-supervised learning. We evaluate the performance of the proposed framework on two naturally partitioned federated datasets, KiTS19 and FeTS2021, and show its promising performance.

artificial intelligence, machine learning, silo, (13 more...)

arXiv.org Artificial Intelligence

2304.09327

Country:

North America > United States > California (0.14)
North America > United States > Minnesota (0.04)
Europe > Italy (0.04)
(2 more...)

Genre: Research Report (1.00)

Industry: Health & Medicine > Diagnostic Medicine > Imaging (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.69)

Integration remains key challenge for digital transformation

#artificialintelligence, Feb-25-2023, 04:35:37 GMT

It's a business pain point most know only too well, and new research confirms that integration challenges are not just a pain, they're slowing companies' digital ambitions and causing infrastructure issues and risks. MuleSoft's eighth annual Connectivity Benchmark Report shows the number of applications in Australian organisations (sorry, New Zealand, there are no Kiwi results in this one) has increased nearly 10 percent in the past year, to 1,032, highlighting the complexity of the digital landscape. But 68 percent of those applications are not integrated with other applications used by the business, creating data silos and flow-on effects including increased costs, duplicated work, productivity bottlenecks and disconnected experiences. It's a situation that's proving costly – not just in terms of money spent building custom integrations (read on for those eye-watering figures) but also in the slowing of digital transformation efforts – something 84 percent of Australians said was happening, causing infrastructure issues and major risks as IT budgets come under increased scrutiny. And the cost of failing to complete digital transformation initiatives successfully?

australian organisation, digital transformation, integration, (16 more...)

#artificialintelligence

Country:

Oceania > New Zealand (0.26)
Oceania > Australia (0.05)

Genre: Research Report (0.58)

Industry: Information Technology (0.32)

Technology:

Information Technology > Data Science > Data Integration (0.53)
Information Technology > Artificial Intelligence > Representation & Reasoning > Information Fusion (0.53)
Information Technology > Architecture > Real Time Systems (0.34)

Are Data Silos Undermining Digital Transformation? - ReadWrite

#artificialintelligence, Dec-8-2022, 21:17:20 GMT

At a time of seemingly ultrarapid digital disruptions, digital transformation in an enterprise needs a bold vision and an intent to embrace change. With the global digital transformation market projected to reach $2.8 trillion in 2025, leaders are expediting their transition to digital across their organizations. And as enterprises course-correct and adapt to specific strategies along this journey, they need a sound understanding of their data to drive informed decisions. That understanding matters because high-quality data is at the heart of all digitalization initiatives, from delivering invaluable insights to uncovering latent operational efficiency strategies. And that's the reason organizations must be careful about the creation of data silos. Today 73.5% of most leading companies are data-driven in their decision-making.

data silo, data silo undermining digital transformation, silo, (10 more...)

#artificialintelligence

Technology:

Information Technology > Information Management (1.00)
Information Technology > Data Science > Data Integration (0.50)
Information Technology > Artificial Intelligence > Representation & Reasoning > Information Fusion (0.50)
Information Technology > Data Science > Data Quality (0.49)

La veille de la cybersécurité

#artificialintelligence, Oct-29-2022, 05:49:06 GMT

Artificial intelligence just doesn't pop up when you install tools and software. It takes planning and, most of all, it takes data. But getting the right data to make AI and machine learning algorithms -- and understanding it -- is where many organizations are slipping up, a recent study finds. Organizations face difficulties with data silos, explainability, and transparency, a study of 150 data executives commissioned by Capital One and Forrester Consulting finds. They say internal, cross-organizational, and external data silos slowed machine learning deployments and outcomes.

data scientist, data silo, veille, (1 more...)

#artificialintelligence

Genre: Research Report (0.62)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)
